Design and Realization of Mongolian Syntactic Retrieval System Based on Dependency Treebank
Abstract
Over the past seven years, the Language Research Institute of Inner Mongolia University has constructed a 500,000-word Mongolian dependency treebank. The treebank provides a valuable data platform for language research and information processing. To make effective use of it, we designed and implemented a graphical syntactic information retrieval system based on the Mongolian dependency treebank. As an application system, it offers search and statistical analysis at the word, phrase, syntactic fragment, and syntactic structure levels.
Keywords: Mongolian Language; Dependency Grammar; Dependency Treebank; Syntactic Retrieval; Information Retrieval
Similar resources
An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
Feature Engineering in Persian Dependency Parser
A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of a phrase-structure parser fo...
A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three kinds of methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well on the Persian language if they are trained with good features. To get good performanc...
Exploiting catenae in a parallel treebank alignment
This paper aims to introduce the issues related to the syntactic alignment of a dependency-based multilingual parallel treebank, ParTUT. Our approach to the task starts from a lexical mapping and then attempts to expand it using dependency relations. In developing the system, however, we realized that the dependency relations between individual nodes alone were not sufficient to overcome som...
Creating a Dependency Syntactic Treebank: Towards Intuitive Language Modeling
In this paper we present a user-centered approach for defining the dependency syntactic specification for a treebank. We show that by collecting information on syntactic interpretations from the future users of the treebank, we can model so far dependency-syntactically undefined syntactic structures in a way that corresponds to the users’ intuition. By consulting the users at the grammar defini...
Publication date: 2015